Density functional calculations of planar DNA base-pairs

We present a systematic Density Functional Theory (DFT) study of geometries and energies of the
nucleic acid DNA bases (guanine, adenine, cytosine and thymine) and 30 different DNA base-pairs. 
We use a recently developed linear-scaling DFT scheme, which is specially suited for systems with 
large numbers of atoms. As a first step towards the study of large DNA systems, in this work: (i) 
We establish the reliability of the approximations of our method (including pseudopotentials and 
basis sets) for the description of the hydrogen-bonded base pairs, by comparing our results with 
those of former calculations. We show that the interaction energies at Hartree-Fock geometries are
in very good agreement with those of second order Møller-Plesset (MP2) perturbation theory (the
most accurate technique that can be applied at present for systems of the size of the base-pairs). (ii)
We perform DFT structural optimizations for the 30 different DNA base-pairs, only three of which 
had been previously studied with DFT. Our results provide information on the effect of correlation 
on the structure of the other 27 base pairs, for which only Hartree-Fock geometries were formerly 
available. 



I. INTRODUCTION 



Hydrogen bonds between complementary purine-pyrimidine
bases play a significant role in the bonding between the
two strands of the double helix structure of DNA.
Nevertheless, other factors are also of paramount
importance in determining the structure of the helix. For
instance, base-stacking plays a crucial role in shielding
the hydrophobic aromatic rings from interaction with
water molecules, besides increasing the Van der Waals
interactions. Also, both counterions and water molecules
are important in screening the electrostatic repulsion
between the negatively charged phosphate groups. The
theoretical study of these systems and of the effects of
each type of interaction has been hindered by the great
complexity of the calculations, due both to the difficulty
in treating such different interactions and to the large
number of atoms involved.

Although some progress has been made recently,
semi-empirical quantum chemistry (QC) methods and
empirical force fields are generally not accurate enough to
describe these systems. The most reliable procedure is
undoubtedly the ab-initio QC approach, in which the
accuracy of the calculation can be systematically improved
by increasing the size and quality of the basis set and
the level of correlation included. This can be a powerful
method to study DNA and other biological systems,
and a unique tool to address some of their properties.
However, the intensive numerical effort required by these
methods poses a serious limitation on the system sizes
that can be handled at a satisfactory level of theory,
a fact which has precluded their use in realistic
biological systems.

An alternative, which allows calculations for system
sizes well beyond the limits of the traditional ab-initio QC
methods, is Density Functional Theory (DFT).
At present, it appears that DFT is the only first principles
method potentially capable of handling the large
sizes involved, although the standard DFT techniques
are still too expensive for systems with more than a
few hundred atoms, like DNA molecules.
If DFT methods are to make an impact in biological
systems, it is necessary to go beyond the
current size limits while maintaining the current
accuracy. In this context, an extremely promising
development has been the recent search for computational
techniques in which the numerical effort scales only linearly
with system size: the so called 'order-N' methods, which
have been reviewed in the literature. They open, for the
first time, the possibility of performing calculations on
very large molecules, and have already been applied to the
study of DNA chains with many hundreds of atoms by means
of semi-empirical Hamiltonians and approximate,
non-selfconsistent DFT.

The application of order-N techniques in the context
of fully first-principles, selfconsistent DFT calculations is,
in general, at a less advanced stage of development.
Nevertheless, we have recently proposed a DFT method and
the corresponding computer code SIESTA, with order-N
scaling, which is able to do such calculations in systems
with thousands of atoms on single-processor workstations.
Our preliminary tests have demonstrated
that the method is able to treat systems as large as a
whole turn of a DNA chain with more than 650 atoms,
therefore opening the possibility of studying large
biological molecules from first principles. Work in this
direction is underway, and will be published elsewhere.

The purpose of this paper, as a first step in the
application of the method to complex biological systems,
and in particular to DNA molecules, is to make a thorough
study of isolated nitrogenated bases and hydrogen-bonded
base-pairs. This study serves to validate the
present method (basis orbitals, approximations and
numerical techniques) for the study of these base-pairs, by
comparing the energies and geometries with those of
previous calculations, where available. The results presented
here show that our method provides a very accurate
description of these systems, with the advantage of being
quite fast and, as mentioned, capable of reaching
very large system sizes. In addition, we provide a systematic
DFT study of the structures of the different base-pairs,
and of the effect of the relaxations on the interaction
energies. The use of our DFT scheme to obtain equilibrium
geometries has the advantage that it includes correlation
effects, which are absent in the available Hartree-Fock
(HF) geometries. At the same time, it is computationally
feasible, unlike the Møller-Plesset second order
perturbation theory (MP2) method.

The rest of this paper is organized as follows. In
Section II we discuss the details of our DFT method and
of the calculations performed. Section III describes the
energetics of the base-pairs at the available HF
geometries, comparing our results with those of MP2
calculations in the literature. Section IV presents our results for
the DFT structural relaxations. Finally, in Section V we
present the conclusions of this work.



II. METHOD AND CALCULATIONS DETAILS 
A. The SIESTA method 

All our calculations have been done with SIESTA, a
code for DFT calculations in systems with a large number
of atoms, in which the cost of the calculation (both in
memory and CPU time) scales linearly with the size of
the system. Here we give only a brief description of the
basic approximations involved in the calculation; a
detailed description can be found in the references on the
SIESTA method.

We treat exchange-correlation (XC) within the framework
of the Kohn-Sham formulation of DFT. It is
rather well known from many calculations in a variety
of systems that a correct description of hydrogen
bonds requires the use of non-local XC functionals. The
Local Density Approximation (LDA) yields bond distances
in the hydrogen bonds which are about 10-15%
shorter, and binding energies about 50-70% larger, than
the experimental values. Inclusion of gradient corrections
in several Generalized Gradient Approximation
(GGA) functionals improves the description dramatically,
achieving levels of accuracy an order of magnitude
better than LDA. In this work we have used the first
principles GGA functional proposed recently by Perdew,
Burke and Ernzerhof (PBE).

SIESTA uses non-local, norm-conserving pseudopotentials
to eliminate the core electrons from the calculation,
and to produce a smoother valence charge density. In this
work, the pseudopotentials are obtained from first
principles, following the scheme of Troullier and Martins.
The valence electrons are described using the linear
combination of atomic orbitals (LCAO) approximation.

An essential ingredient for the linear scaling within this
approach is the finite range of the matrix elements
between atomic orbitals. To achieve it, we use basis orbitals
which strictly vanish beyond a cutoff radius (instead of
the usual approach of using decaying orbitals and
neglecting matrix elements according to some criterion). The main
advantage is consistency: given a basis, the eigenvalue
problem is solved for the full Hamiltonian. Thus, the
procedure is numerically very stable even for short ranges,
in contrast with the usual approach. Since the
computational load grows substantially with the basis range, it is
important to work with basis functions that display fast
convergence for short orbital ranges. We have developed
a scheme for finite range basis set generation, which we
now outline.
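The link between strictly confined orbitals and linear scaling can be illustrated with a minimal sketch. It assumes that two orbitals can only have a nonzero matrix element when their centers are closer than the sum of their cutoff radii; the chain geometry, spacing and cutoff radius are hypothetical values chosen for illustration, not parameters of this work.

```python
import math

def count_overlapping_pairs(positions, cutoff_radius):
    """Count atom pairs whose strictly confined orbitals can overlap.

    Two orbitals that vanish beyond `cutoff_radius` can overlap only if
    their centers are closer than 2 * cutoff_radius. The double loop is
    quadratic and only meant for illustration.
    """
    n_pairs = 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < 2.0 * cutoff_radius:
                n_pairs += 1
    return n_pairs

def linear_chain(n_atoms, spacing):
    """Hypothetical test geometry: atoms equally spaced along x."""
    return [(k * spacing, 0.0, 0.0) for k in range(n_atoms)]

# Illustrative numbers only (arbitrary length units).
for n in (50, 100, 200):
    pairs = count_overlapping_pairs(linear_chain(n, 1.5), cutoff_radius=2.0)
    print(n, pairs, pairs / n)
```

In this toy geometry the number of overlapping pairs per atom stays close to constant as the chain grows, so the number of nonzero Hamiltonian and overlap matrix elements grows only linearly with the number of atoms.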

In this and previous works, the radial parts of the
finite-range orbitals were determined in the spirit of
the method of Sankey and Niklewski, who proposed
a scheme for minimal (single-ζ) bases that we have
generalized to arbitrarily complete sets. The single-ζ
orbitals are obtained by solving the DFT atomic problem
(including the pseudopotential) with the boundary
condition that the orbitals be zero beyond the cutoff
radius, while remaining continuous. For the efficient
generation of larger, more complete basis sets we have used
the ideas developed within the QC community over the
years, incorporating them into new schemes adapted to
numerical, finite-range bases for linear scaling. Numerical
multiple-ζ bases are constructed in the split-valence
philosophy. Given an atomic orbital, it is split into
two or more functions. The first splitting is made by
introducing a smooth function that reproduces exactly the
tail of the original orbital beyond a specified radius. The
difference between the original orbital and this smooth
function is an orbital with an even shorter range.
Multiple splits are obtained by repeating the procedure. Our
approach also allows polarization orbitals. These are
obtained by numerically solving the problem of the
isolated atom in the presence of a polarizing electric field.
Comparing the solution with a perturbative expansion
(Sternheimer equations) gives the shape of the desired
polarization orbitals. The cutoff radius of the
polarization orbitals is therefore the same as that of the shell
being polarized.

In all the calculations presented in this work we have
used a double-ζ (split valence) basis with polarization
functions on all the atoms (including hydrogen). The
cutoff radii for the atomic orbitals of each element are
given in Table I, and were obtained by fixing a confinement
energy of 50 meV.
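The split-valence construction described above can be sketched as follows. The polynomial form r^l (a - b r^2) used for the smooth inner function, the matching of value and slope at the split radius, and the model orbital are all assumptions made for illustration; they are not taken from this paper and the orbital is not an actual SIESTA basis function.

```python
def split_orbital(r, phi, dphi, l, i_split):
    """Split a radial orbital in the split-valence spirit.

    r, phi, dphi : lists with the radial grid, the orbital and its derivative
    l            : angular momentum of the shell
    i_split      : grid index of the split radius
    Returns (smooth, short). `smooth` equals phi for r >= r[i_split] and is
    r**l * (a - b*r**2) inside; `short` = phi - smooth vanishes beyond it.
    """
    rs, f, df = r[i_split], phi[i_split], dphi[i_split]
    # Match value and slope of r**l * (a - b*r**2) to the orbital at rs.
    b = (l * f / rs - df) / (2.0 * rs ** (l + 1))
    a = f / rs ** l + b * rs ** 2
    smooth = [p if x >= rs else x ** l * (a - b * x ** 2)
              for x, p in zip(r, phi)]
    short = [p - s for p, s in zip(phi, smooth)]
    return smooth, short

# Hypothetical confined s-type orbital that vanishes at rc.
rc = 6.0
r = [0.01 * k for k in range(601)]
phi = [(1 - (x / rc) ** 2) ** 2 for x in r]
dphi = [-4 * x / rc ** 2 * (1 - (x / rc) ** 2) for x in r]

smooth, short = split_orbital(r, phi, dphi, l=0, i_split=300)
print(short[0], short[300], short[600])
```

The second function returned here is nonzero only inside the split radius, which is the sense in which it has a shorter range than the original orbital.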

The matrix elements of the different terms of the Kohn- 
Sham Hamiltonian are calculated in one of two different
ways. The terms that involve integrals over two
atoms only (kinetic energy, overlap, and other terms re- 
lated with the pseudopotential, see below) are computed 
a priori as a function of the distance between the cen- 
ters, and stored in tables to be interpolated later with 
very little use of time and memory. The other terms 
are calculated with the help of a uniform grid of points 
in real space. The smoothness of the integrands deter- 
mines how fine a grid is needed, and, of course, the finer 
the grid, the more expensive the calculation. We remark 
that the use of pseudopotentials, which eliminates the 
rapidly varying core charge, is essential to provide func- 
tions smooth enough to make the grid integration fea- 
sible. This fineness is measured by the energy of the 
shortest wave-length plane-wave that can be described 
with the grid, in analogy with plane-wave calculations.
In all the calculations presented here, we have used a 
cutoff of 125 Ry. 
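As a rough guide to what this cutoff means in real space, the sketch below converts a plane-wave energy into a grid spacing. It assumes the usual Nyquist argument (the shortest wavelength representable on a grid of spacing h is 2h) and Rydberg atomic units, in which the kinetic energy of a plane wave is k^2 with k in inverse bohr; the exact convention used inside the code may differ.

```python
import math

BOHR_IN_ANGSTROM = 0.529177

def grid_spacing_bohr(cutoff_ry):
    """Grid spacing (bohr) whose shortest representable plane wave has
    kinetic energy `cutoff_ry` (Ry).

    Shortest wavelength on a grid of spacing h is 2h, so k_max = pi / h,
    and in Rydberg atomic units E = k_max**2.
    """
    return math.pi / math.sqrt(cutoff_ry)

h = grid_spacing_bohr(125.0)
print(f"{h:.3f} bohr = {h * BOHR_IN_ANGSTROM:.3f} Angstrom")
```

Under these assumptions a 125 Ry cutoff corresponds to a grid spacing of roughly 0.28 bohr.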

The calculation of the pseudopotential matrix elements
is done very efficiently using the Kleinman-Bylander
factorized form. It allows the three-center integrals of the 
pseudopotential between atomic orbitals to be treated as 
products of two-center integrals, which are tabulated as 
described above. 
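Schematically, and in generic notation that is ours rather than the paper's, a fully separable non-local pseudopotential and its matrix elements between basis orbitals can be written as below. Here the projectors χ are centered on atom α, the v are the corresponding strengths, and φ are basis orbitals; the precise definition of the projectors is not given in this text.

```latex
\hat V^{\mathrm{NL}} = \sum_{\alpha} \sum_{n}
  |\chi_{\alpha n}\rangle \, v_{\alpha n} \, \langle \chi_{\alpha n}| ,
\qquad
\langle \phi_\mu | \hat V^{\mathrm{NL}} | \phi_\nu \rangle
  = \sum_{\alpha, n}
    \langle \phi_\mu | \chi_{\alpha n} \rangle \, v_{\alpha n} \,
    \langle \chi_{\alpha n} | \phi_\nu \rangle .
```

Each factor on the right-hand side involves one orbital and one projector only, which is why the three-center integral reduces to products of tabulated two-center integrals.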

With the bases, approximations and techniques
described so far, the Kohn-Sham Hamiltonian is built up
with order-N operations. The solution to the eigenvalue
problem can also be obtained with a linear scaling
effort using techniques recently developed, and available
in SIESTA. For the small systems considered here,
however, the straight diagonalization (which scales as the 
cube of the number of orbitals) requires very little effort, 
and therefore has been used in this work to solve the 
Kohn-Sham eigenvalue equations. 



B. Details of the calculations 

In order to reach reliable conclusions about the
accuracy of our method, we have used in this study a large
set of 30 base-pairs. Besides the common Watson-Crick
guanine-cytosine and thymine-adenine pairs, we also
consider a significant range of other configurations of the four
bases guanine, cytosine, adenine and thymine (G, C, A,
T). These are the same as those studied by Sponer et al.
in their MP2 study. The Watson-Crick configurations are
designated WC, and the Hoogsteen, reversed Hoogsteen
and reversed Watson-Crick appear as H, RH and RWC,
respectively. Other configurations are distinguished simply
with numbers, e.g. AA1, AA2, etc. In assigning the
numbers to the pairs we have followed the nomenclature of
Hobza and Sandorfy, who classified the pairs in order of
decreasing stability. Their ordering was not confirmed by
later results (including ours), but the convention is
nevertheless maintained to simplify comparisons and avoid
confusion. The structures of the bases and base-pairs
studied in this work, and the numbering of their atoms,
are taken from the literature (the structures are shown in
Figures 1 and 2 of the cited reference).

In the calculation of the energetics of the base-pairs, we
have analyzed two different quantities. First, the
interaction energy, Eint, defined as the energy of the base-pair
minus the energy of each base with the same geometry
it has in the pair. Second, the total stabilization energy,
Et, defined as the difference between the energy of the
pair and that of each base in its isolated optimal
geometry. Therefore, the difference between Et and Eint is the
deformation energy, i.e., the increase in intramolecular
energy due to the geometry change when the base-pair is
formed.
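Written out, the definitions above read as follows. The labels are our notation: A_AB and B_AB denote the bases frozen at the geometry they have in the pair, and A_0 and B_0 the bases at their isolated optimal geometries.

```latex
\begin{align}
E_{\mathrm{int}} &= E(AB) - E(A_{AB}) - E(B_{AB}), \\
E_{t} &= E(AB) - E(A_{0}) - E(B_{0}), \\
E_{\mathrm{def}} &= E_{t} - E_{\mathrm{int}}
  = \left[ E(A_{AB}) + E(B_{AB}) \right] - \left[ E(A_{0}) + E(B_{0}) \right] .
\end{align}
```

Since A_0 and B_0 are energy minima of the isolated bases, the deformation energy is non-negative, so Et is never more negative than Eint.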

Due to the finite size of the basis sets used, both Eint
and Et have to be corrected for the basis set
superposition error (BSSE). In this work, all the energies
have been corrected for BSSE, as described in the
following. For the interaction energy, we have used the
standard Boys-Bernardi counterpoise correction: the BSSE
is calculated as the difference between the energies of
the isolated bases obtained with the orbitals of the base
alone, and with the "ghost" orbitals of the other base:
BSSE = E(A) + E(B) - E(A*) - E(B*) (where the
asterisks indicate the inclusion of the orbitals of the other
base in the calculation). The same correction is used for
the stabilization energy Et. Since Et contains the
deformation energy Edef, this approach is valid only if Edef is
not much affected by the BSSE (i.e., if the change in the
BSSE is not large when calculated with the relaxed
isolated bases geometry instead of the coordinates of each
base in the pair). We tested this and found that the
variation in BSSE calculated with these two geometries is
only about 10% of the total BSSE value, so we will
consider that the BSSE correction defined above is as valid
for Et as for Eint.
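The bookkeeping of the counterpoise correction described above can be sketched as follows. The function and argument names are hypothetical, and the numbers in the usage lines are placeholders in arbitrary energy units, not results from this work.

```python
def counterpoise_bsse(e_a, e_b, e_a_ghost, e_b_ghost):
    """BSSE = E(A) + E(B) - E(A*) - E(B*), all at the pair geometry.

    e_a, e_b             : isolated bases with their own orbitals only
    e_a_ghost, e_b_ghost : same bases including the ghost orbitals of the
                           partner base
    """
    return e_a + e_b - e_a_ghost - e_b_ghost

def pair_energetics(e_pair, e_a, e_b, e_a_ghost, e_b_ghost, e_a_opt, e_b_opt):
    """Return BSSE-corrected Eint and Et, plus Edef and the BSSE itself.

    e_a_opt, e_b_opt are the energies of the bases at their isolated
    optimal geometries; the same BSSE is applied to Eint and Et.
    """
    bsse = counterpoise_bsse(e_a, e_b, e_a_ghost, e_b_ghost)
    e_int = e_pair - e_a - e_b + bsse
    e_def = (e_a + e_b) - (e_a_opt + e_b_opt)
    e_t = e_int + e_def
    return {"bsse": bsse, "e_int": e_int, "e_def": e_def, "e_t": e_t}

# Placeholder energies, for illustration only.
result = pair_energetics(e_pair=-100.0, e_a=-60.0, e_b=-39.0,
                         e_a_ghost=-60.1, e_b_ghost=-39.05,
                         e_a_opt=-60.2, e_b_opt=-39.1)
print(result)
```

Note that the corrected interaction energy is simply the pair energy minus the energies of the two bases computed with ghost orbitals.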

The structural relaxations were done by means of a
conjugate gradient minimization of the energy, until the
forces on all atoms were smaller than 0.04 eV/Å. No
constraints were imposed in the relaxation, except the
planarity of the base-pairs. This constraint was imposed in
order to facilitate comparison with the results of Sponer
et al., who also analyze planar bases and base-pairs. In
the relaxations, forces are calculated as analytical
derivatives of the total energy. No BSSE correction was
included in the forces. This would lead to problems if the
BSSE had an important variation with atomic positions,
since in that case the relaxed geometries obtained
without the BSSE correction would not correspond to the
minimum of the total energy including the BSSE
correction. In order to check this, corrected and uncorrected
interaction energies of the AA1 base-pair were calculated
as a function of the distance, separating the molecules
rigidly along the direction of the H bonds. As expected, the BSSE
is larger in absolute value when the two bases
are brought closer, and decreases as the bases are
separated. However, the results show that this variation
appreciably affects neither the equilibrium distance
(the acceptor-hydrogen distance was 1.95 Å for the
uncorrected curve and 1.97 Å for the corrected one) nor
the vibrational frequencies.
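The stopping criterion and the planarity constraint can be sketched as below. This is only an illustration: it assumes that the criterion applies to the norm of the force on each atom, that the base-pair lies in the xy plane, and that planarity is kept by discarding out-of-plane force components. None of these implementation details are stated in the text, and the forces used are placeholders.

```python
import math

FORCE_TOL = 0.04  # eV/Angstrom, the threshold quoted in the text

def project_onto_plane(forces):
    """Remove the out-of-plane (z) force components.

    Assumes the base-pair lies in the xy plane; this is one simple way to
    keep a planar structure planar during the minimization.
    """
    return [(fx, fy, 0.0) for fx, fy, _ in forces]

def is_relaxed(forces, tol=FORCE_TOL):
    """True when the force on every atom has a norm below `tol`."""
    return all(math.sqrt(fx * fx + fy * fy + fz * fz) < tol
               for fx, fy, fz in forces)

# Placeholder forces (eV/Angstrom), not results from this work.
forces = [(0.01, 0.02, 0.50), (0.00, 0.03, -0.50)]
print(is_relaxed(forces))                      # includes out-of-plane components
print(is_relaxed(project_onto_plane(forces)))  # in-plane components only
```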

Finally, it is worth mentioning the computational re- 
quirements of the calculations presented here. All the 
computations were performed on a PC with a 120 MHz
Pentium processor. For the relaxations of the base-pairs, 
the calculations took an average of about 4 hours of CPU 
time for each relaxation step. The memory usage was below
100 Mb, and virtually no disk use was necessary (all
the integrals being stored in memory). This shows the 
efficiency of the code, and the possibility of studying rel- 
atively large systems in very modest platforms. 



III. ENERGIES AT HF/6-31G** GEOMETRIES 

In order to demonstrate the accuracy and validity
of our method for the description of the energies of
hydrogen-bonded base-pairs, we need a reference with
which to compare our results. Since the experimental
information on energies and structures of isolated bases and
base-pairs is very scarce, we have used the results from
former ab-initio calculations as a benchmark. There is a
large amount of work done on these systems in the
context of ab-initio QC methods. Probably the most
complete and sophisticated calculations are those performed
by Sponer and coworkers using the MP2 method. This
usually covers a substantial part of the correlation energy,
and is the most accurate correlation technique that can
be applied at present for systems of the size of a few
tens of atoms, like the base-pairs. However, these
calculations are still computationally expensive, and
therefore they can only be done using medium sized bases
(typically 6-31G**) and fixed geometries obtained with
simpler schemes like HF. Geometry optimizations at the
MP2 level have only been possible for the smallest,
highest symmetry base-pair (cytosine-cytosine).

We will therefore discuss the interaction energies
obtained with SIESTA for the base-pairs in the HF/6-31G**
geometries of Sponer et al., and compare with
the corresponding MP2 results. The data are shown in
Table II. The GG2 and GC2 base-pairs are not included in
this table because at this level of theory these pairs
are not stable and converge to the configurations of GG1
and GCNEW, respectively. We compare with the MP2
results evaluated on the same geometry. We also show
the percentage deviation between both results. For all
the base-pairs except GG4, the agreement is quite
good, with differences smaller than 8% and much
less in most cases. GG4 seems to be an exception to the
general trend, as its difference with the MP2 value is 26%.
We tried to see if this was a problem of the basis set and
made calculations with larger cut-off radii for the atomic
orbitals. For an energy shift of 10 meV the interaction
energy was -8.4 kcal/mol, so the deviation with respect
to the MP2 results is reduced to 16%, but it is still far
larger than for the rest of the base-pairs (for other
base-pairs, the difference in the interaction energy for the 50
and 10 meV energy shift bases is much smaller than in
the GG4 case).

The standard deviation of our results compared to
the MP2 values is 0.73 kcal/mol. It is interesting to
compare these results with the DFT values obtained by
Sponer et al. for the same HF/6-31G** geometries,
using the Becke3LYP functional. The largest deviation
of their results is 1.3 kcal/mol (11% of Eint) for the
TC1 pair, while the standard deviation from the MP2
results is accidentally the same as ours: 0.73 kcal/mol. We
can therefore conclude that the results obtained for the
energies at fixed geometries using the PBE functional are
of a similar degree of accuracy as those obtained by other
authors with other GGA functionals. The standard
deviation between our results and the DFT data of Sponer et
al. is about 0.85 kcal/mol, of the same order as the
difference with the MP2 results. This serves to validate the
PBE functional, as well as SIESTA and the approximations
involved (cut-off bases, pseudopotentials, grid
integrations, etc.), as a valuable and competitive tool
compared with standard, all-electron, gaussian-bases DFT
programs.
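The two statistics used in this comparison can be computed as in the sketch below. It assumes that the percentage deviation is taken relative to the reference (MP2) value and that the quoted "standard deviation" is a root-mean-square deviation over the set of pairs; the text does not spell out either definition, and the numbers in the usage lines are placeholders rather than entries of Table II.

```python
import math

def percent_deviation(value, reference):
    """Absolute deviation of `value` from `reference`, in percent."""
    return 100.0 * abs(value - reference) / abs(reference)

def rms_deviation(values, references):
    """Root-mean-square deviation between two equally long lists."""
    squared = [(v - r) ** 2 for v, r in zip(values, references)]
    return math.sqrt(sum(squared) / len(squared))

# Placeholder interaction energies in kcal/mol, for illustration only.
dft = [-9.0, -12.5, -20.0]
ref = [-10.0, -12.0, -21.0]
for v, r in zip(dft, ref):
    print(f"{percent_deviation(v, r):.1f}%")
print(f"{rms_deviation(dft, ref):.2f} kcal/mol")
```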

The current LDA and GGA implementations of DFT 
are not able to describe accurately Van der Waals or dis- 
persion interactions. Still, the previous results show that 
the common non-local XC functionals provide quantita- 
tively accurate values of the interaction energies of the 
hydrogen-bonded base-pairs. Although the reason for 
this is not fully clear yet, it seems that the dispersion en- 
ergies in these H-bonded systems are significantly smaller 
than the ones that would be predicted using an empirical 
London dispersion energy. It is still unclear if current
XC functionals are able to describe the interaction be- 
tween stacked base-pairs, where the dispersion energy is 
larger than expected from an empirical London formula. 
Calculations by Sponer and coworkers seem to indicate
that the Becke3LYP functional significantly underesti- 
mates the dispersion energies for stacked base pairs. A 
study of the performance of other XC functionals with 
our method is underway, and will help in clarifying this 
issue. 



To conclude this section, we show in Table III the
dipole moments for the HF/6-31G** geometries. We
compare the results of the HF/6-31G** calculations of
Sponer et al. with those of this work. DFT provides
lower values, due to the tendency of the Hartree-Fock
approximation to overestimate the electrostatic
interactions.



IV. DFT GEOMETRY OPTIMIZATIONS

At present, DFT studies of the geometries of base-pairs 
are rather scarce. To our knowledge, only the geometries 
of the Watson-Crick TA and GC, and the CC base-pairs 
have been obtained with DFT. Therefore, we have ex- 
tended the previous DFT works, and performed a thor- 
ough study of the energetics and structural relaxations 
of nitrogenated base-pairs with our DFT approach. Here 
we describe the results of such structural optimizations 
for the 30 hydrogen-bonded base-pairs (27 of which had 
not been studied with DFT previously). 



A. Isolated bases 

We have first optimized the geometries of isolated
adenine, guanine, cytosine and thymine using our method.
The results for the geometries obtained (bond distances
and angles) are presented in Tables IV-VII. For
comparison, we also show the DFT results of Santamaría
and Vázquez (S-V), obtained with the Vosko-Wilk-Nusair
functional with Becke-Perdew non-local corrections,
as well as the experimental values obtained with
X-ray diffraction for crystallized DNA. The bases
were relaxed for planar geometries.

We can observe that the geometries obtained here are
close to those of S-V, which were obtained with an
all-electron calculation with gaussian bases and a different
XC functional. Again, this supports the reliability of
SIESTA and its approximations. The bond distances
obtained in this work are usually slightly larger than the
values of S-V, although the largest difference is only
0.016 Å. The results of S-V are slightly closer to the
experimental data. The experimental values must be
taken only as a rough reference, since they correspond to
measurements of DNA crystals; packing forces constrain
the molecules, so all the bond distances are shorter than
those calculated for the free bases. Some of the
differences between our results and those of S-V are due to
the restriction of planarity in our calculation, which was
not imposed by S-V. It is well known that the amino
groups of the nucleic acid bases undergo a pyramidalization
due to partial sp3 hybridization: the two H atoms
go out of the plane of the aromatic ring whereas the N
atom moves in the opposite direction. Therefore, there
are in some cases important differences between the planar
and non-planar bases in the distances and angles that
involve atoms in the amino groups.



B. Base-pairs

Here we discuss our results for the structural
optimizations of the base-pairs performed with the SIESTA
program. Our relaxations start from HF/6-31G**
geometries, with the exception of the GG2 and GC2 base-pairs
(not stable at the HF/6-31G** level), for which we start
from the HF/MINI-1 coordinates. In all cases, planar
symmetry was imposed in the relaxations.

Table VIII shows the H-bond distances and angles
for the base-pairs obtained with our method.
Donor-acceptor and donor-hydrogen distances are shown,
together with the angle subtended by the three atoms
involved in the bond. Comparing with HF/6-31G**
geometries (see Ref. [21]), we see that hydrogen bridges are
shorter and donor-H distances are larger in our
calculations.

Among the hydrogen bonds in the base-pairs studied,
33 are N(H)···N and 29 are O···(H)N. Their distances
range from 2.755 to 3.169 Å. Among the longest of them,
there is a clear majority of N(H)···N, which seems logical
because the electrostatic attraction between the H atom
and a N atom should be weaker than the attraction
between H and O. However, there are also some N(H)···O
bonds that are quite long. The reason is that several
factors and interactions, and not only the atoms which are
involved in the H-bond, influence the final configuration
of the pairs.

Distances between donor and H atoms range from
1.026 to 1.070 Å. There is in many cases a correlation
between short H-bond distances and long donor-H
distances. It is clear that the greater the electrostatic
attraction the H atom experiences from the other base, the longer
will be its distance from the donor atom and vice-versa.
However, again this cannot be taken as a strict rule, as 
the final position of each atom is determined by all the 
neighboring atoms. 

The dipole moments at the geometries relaxed with
our approach are shown in Table III. They are all smaller
than the HF/6-31G** values, except for two of the bases,
and there are no major differences with the results
obtained with SIESTA for the HF/6-31G** geometries,
although there is in almost all cases a slight increase in
their values.



Table IX shows the interaction (Eint) and total
stabilization (Et) energies for the base-pairs, obtained at
the SIESTA relaxed coordinates. The ordering of the
base-pairs is the same as that of Table II, to
facilitate comparison. Interaction energies range from -32.2
to -9.6 kcal/mol, and stabilization energies from -27.7 to
-8.0 kcal/mol. The most stable base-pair is GCWC, in
agreement with previous results, and the ordering of
the next three base-pairs is also the same. The relative
ordering (according to Eint) of all the TA, GT, GG, GA,
AA and TT tautomers is conserved, but AC2 is more
stable than AC1 and TC2 more stable than TC1 with
our method. It is interesting to see that the energetic
ordering depends on which of the two energies is used,
Eint or Et. This is not the case in MP2//HF results,
where only two base-pairs change position when ordered
according to Eint instead of Et (although the ordering
could change further if the geometries were obtained at
the MP2 level, too).

Several points are worth noticing from the results of
Table IX. (i) The interaction energies are systematically
larger (by a few kcal/mol) than those obtained, with the
same theory, at the HF/6-31G** geometries (compare
with the results of Table II). This indicates that the
geometries of the H-bonded base-pair configurations are
more sensitive to the details of the calculation than those
of the free bases, and that the HF/6-31G** geometries for
the H-bonds are not fully optimal for DFT calculations.
(ii) The deformation energies (difference between Et and
Eint) are larger than those obtained at the HF/6-31G**
and MP2 levels for HF/6-31G** geometries. In our
calculations, the deformation energy ranges up to as much
as 6.7 kcal/mol for the GCNEW pair. The same trend is
observed in other DFT calculations (see below). (iii) The
GG2 and GC2 base-pairs, which at the HF/6-31G** level are
unstable and converge towards GG1 and GCNEW,
respectively, are found to be stable in our geometry
optimization, which was started from HF/MINI-1
coordinates. The interaction energy of these pairs is small, but
comparable to many of the other base-pairs.

To our knowledge, DFT optimizations of base-pairs
by other authors are only available for GCWC, TAWC
and CC. Tables X, XI and XII present the available
DFT results together with those of SIESTA and
HF/6-31G**, for these three pairs. For CC, results of a MP2
geometry optimization are also available from the work of
Sponer et al., and are included in Table XII.
Acceptor-donor and donor-hydrogen distances are shown, together
with the interaction and stabilization energies. In
general, our results report donor-acceptor distances which
are only slightly shorter than those of other DFT
calculations (with a maximum difference of 0.058 Å), whereas
the donor-H distances are in excellent agreement. All
DFT results yield shorter D-A distances than the
HF/6-31G** approximation. For the energies, the dispersion in
the DFT reports is considerable. Our results agree quite
well with those of Sponer et al. (available for GCWC
and CC), whereas the differences with those of
Santamaría and Vázquez (available for GCWC and TAWC)
are larger. We note that, as mentioned before,
deformation energies are considerably larger in all the DFT
results than in the MP2//HF results. For instance, for
GCWC, the DFT calculations yield values of the
deformation energies of 3.3, 4.8 and 4.7 kcal/mol, whereas the
MP2//HF result is only 2.1 kcal/mol.

It is interesting to discuss the case of CC, since it is
the only base-pair where coordinate optimizations at the
MP2 level are available. We see in Table XII that the
bond length between the donor and acceptor atoms is
underestimated by the DFT calculations, and
overestimated by the HF/6-31G** by about the same amount.
However, it seems that the DFT energies are closer to the
MP2 values at the MP2 geometries than at the HF ones.
Also, the MP2 deformation energy increases from 1.3
kcal/mol for the HF/6-31G** geometries to 1.8 kcal/mol
for the MP2 geometries, therefore approaching that
obtained with DFT (2.3 kcal/mol in the results of Sponer
et al. and 2.4 kcal/mol in our case).



V. CONCLUSIONS 

In this work, we have performed DFT calculations on 
the DNA bases adenine, guanine, cytosine and thymine, 
and 30 base-pairs formed by these bases. The calcula- 
tions were performed with the SIESTA code, which is 
a novel technique for DFT calculations in systems with 
large numbers of atoms using pseudopotentials to de- 
scribe the effect of the core electrons and finite range 
basis orbitals for the valence electrons. The calculations 
presented here serve to validate our method for the study 
of H-bonded base-pairs, as a first step toward the com- 
plete DNA helix (which is feasible with SIESTA due to 
the linear scaling of the numerical effort with the number 
of atoms in the system).

For calculations on the HF/6-31G** geometries, ex- 
cellent agreement with MP2 results was obtained. The 
deviations are smaller than 8% (which amounts to 1.3 
kcal/mol at most), except for GG4, which differs quite 
significantly from MP2 results. Calculations with longer 
atomic orbital radii reduce the difference, but it is still 
bigger than the rest. The dipole moments for these ge- 
ometries are systematically lower than those of the HF 
calculations. 
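
As an illustration of how the quoted deviations can be obtained, the sketch below assumes that the percent deviation is taken relative to the magnitude of the MP2 value, 100 x (E_SIESTA - E_MP2) / |E_MP2|, so that a positive number means weaker binding in SIESTA. This definition is inferred from the entries of Table II and is not stated explicitly in the text; the names used are hypothetical.

```python
# Illustrative percent deviation of SIESTA//HF interaction energies from MP2//HF.
# Assumption: deviation = 100 * (E_siesta - E_mp2) / |E_mp2|.
# Energies (kcal/mol) are taken from Table II for four base-pairs.

def percent_deviation(e_mp2, e_siesta):
    """Signed deviation of the SIESTA value, in percent of |E_mp2|."""
    return 100.0 * (e_siesta - e_mp2) / abs(e_mp2)

table_ii = {
    "CC": (-18.8, -17.5),
    "AA2": (-11.0, -11.4),
    "TT2": (-10.6, -9.9),
    "GG4": (-10.0, -7.4),
}

deviations = {pair: round(percent_deviation(mp2, siesta), 1)
              for pair, (mp2, siesta) in table_ii.items()}

for pair, dev in deviations.items():
    print(f"{pair:4s} {dev:6.1f} %")
```

For these four rows the sketch gives 6.9, -3.6, 6.6 and 26.0 %, in line with the statement that only GG4 exceeds 8 %.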

For the isolated bases, the planar geometries obtained 
in our calculation are in good agreement with former 
DFT results. 

The relaxed geometries of the 30 DNA base-pairs were 
also obtained with our method. The donor-acceptor distances
in the hydrogen bonds are systematically shorter
than those of HF/6-31G**, as in other DFT calculations.
Our results compare well with other DFT optimizations
of the GCWC and TAWC base-pairs. For the
CC pair, for which MP2 optimizations are available, the 
results of SIESTA and other DFT calculations slightly 
underestimate the hydrogen bond distances, but provide 
a quite accurate value for the interaction energies. The 
deformation energies upon the dimer formation are larger 
for the DFT results than for the HF geometries, a result 
that is in agreement with the increase of Edef in the 
MP2 approximation when MP2 coordinates are used in 
the calculation. 

Dipole moments for the relaxed geometries are quite 
similar to our previous results for HF geometries, but 
slightly larger in most cases. 

The results for the energetic ordering of the base-pairs
have also been analyzed. Although there are no essential
changes, the ordering is slightly different from that of the
MP2//HF results. However, the relative order between
tautomers is conserved in most cases. The GG2 and GC2 
base-pairs, which were unstable at the HF/6-31G** level, 
are stable in our calculations, and have interaction ener- 
gies similar to the other base-pairs. 
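
The change in energetic ordering can be made concrete for a few pairs. The sketch below ranks eight base-pairs by interaction energy, using the MP2//HF values of Table II and the relaxed SIESTA values of Table IX. Restricting the comparison to these eight pairs, and using Eint rather than Et, are choices made here for illustration only.

```python
# Illustrative comparison of energetic ordering for eight base-pairs.
# Eint in kcal/mol: MP2//HF values (Table II) and relaxed SIESTA values (Table IX).
# The subset of pairs and the use of Eint are assumptions of this sketch.

mp2_hf = {"GCWC": -25.8, "GG1": -24.7, "GCNEW": -22.2, "CC": -18.8,
          "GG3": -17.8, "GA1": -15.2, "GT1": -15.1, "GT2": -14.7}
siesta = {"GCWC": -32.2, "GG1": -30.1, "GCNEW": -26.3, "CC": -21.1,
          "GG3": -18.2, "GA1": -17.9, "GT1": -18.9, "GT2": -17.8}

def ranking(energies):
    """Pairs sorted from most to least stable (most negative Eint first)."""
    return sorted(energies, key=energies.get)

order_mp2 = ranking(mp2_hf)
order_siesta = ranking(siesta)
# Pairs whose position in the ranking differs between the two methods.
moved = [p for p in order_mp2 if order_mp2.index(p) != order_siesta.index(p)]

print("MP2//HF:", order_mp2)
print("SIESTA: ", order_siesta)
print("Pairs whose rank changes:", moved)
```

In this subset the four most stable pairs keep their positions, while GT1 moves ahead of GG3 and GA1 in the relaxed SIESTA results.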

In conclusion, the results presented here show that
SIESTA is a valuable tool for the study of H-bonded DNA
base-pairs. It provides results that are very similar to those
of other DFT techniques and that compare very well with the
available MP2 data. Work is in progress to determine the
validity of the method for the properties of stacked bases,
and for the study of large DNA segments.

TABLE I. Atomic orbital radii (Bohr) for an energy shift
of 50 meV. For each L shell, ζ stands for each of the split
valence orbitals.

               L=0               L=1
Species    ζ=1     ζ=2       ζ=1     ζ=2
H          6.047   2.488     -       -
C          4.994   3.475     6.254   3.746
N          4.390   2.942     5.496   3.092
O          3.937   2.542     4.931   2.672


TABLE II. Base-pair interaction energies (Eint, in kcal/mol) at HF/6-31G** geometries.

Pair        MP2^a         SIESTA        Deviation (%)
GCWC        -25.8         -26.8         -3.9
GG1         -24.7         [illegible]   [illegible]
GCNEW       -22.2         [illegible]   [illegible]
CC          -18.8         -17.5         6.9
GG3         -17.8         -16.0         [illegible]
GA1         -15.2         [illegible]   [illegible]
GT1         -15.1         [illegible]   [illegible]
GT2         -14.7         -14.0         1.4
AC1         -14.3         -14.0         2.1
GC1         [illegible]   [illegible]   -2.8
AC2         -14.1         -14.7         -4.2
GA3         -13.8         -13.8         0.0
TAH         -13.3         -13.7         -3.0
TARH        -13.2         -13.6         -3.0
TAWC        -12.4         -12.3         0.8
TARWC       -12.4         -12.3         0.8
AA1         -11.5         -11.7         -1.7
GA4         -11.4         -11.7         -2.6
TC2         -11.6         [illegible]   7.5
TC1         -11.4         -10.6         7.0
AA2         -11.0         -11.4         -3.6
TT2         -10.6         -9.9          6.6
TT1         -10.6         -10.1         4.7
TT3         -10.6         -10.2         3.8
GA2         -10.3         -10.6         -2.9
GG4         -10.0         -7.4          26.0
AA3         -9.8          -9.8          0.0
2aminoAT    -15.1         -15.2         -0.7


TABLE III. Dipole moments (Debyes) of the DNA base-pairs^a.

Pair        HF//HF   SIESTA//HF   Difference (%)   SIESTA
GCWC        6.5      5.8          -10.8            6.1
GG1         0.0      0.0          0.0              0.0
GCNEW       3.1      3.4          9.7              3.3
CC          0.0      0.0          0.0              0.0
GG3         10.5     10.3         -1.9             10.9
GA1         5.6      4.7          -16.1            4.9
GT1         7.7      6.9          -10.4            7.0
GT2         8.6      8.0          -7.4             8.3
AC1         4.8      3.5          -14.6            3.7
GC1         12.7     10.7         -15.7            11.5
AC2         9.7      8.3          -14.4            8.6
GA3         8.8      7.9          -10.2            8.4
TAH         6.4      5.5          -14.1            5.7
TARH        5.9      5.0          -15.2            5.0
TAWC        2.0      1.4          -30.0            1.4
TARWC       2.5      2.3          -8.0             2.4
AA1         0.0      0.0          0.0              0.0
GA4         9.2      8.2          -10.9            8.8
TC2         4.5      3.9          -13.3            3.8
TC1         5.9      5.3          -10.2            5.5
AA2         4.9      4.7          -4.1             4.8
TT2         0.0      0.0          0.0              0.0
TT1         1.3      1.3          0.0              1.6
TT3         0.0      0.0          0.0              0.0
GA2         7.3      6.4          -12.3            6.8
GG4         0.0      0.0          0.0              0.0
AA3         0.0      0.0          0.0              0.0
2aminoAT    4.2      4.0          -4.7             4.2
GG2         -        -            -                12.7
GC2         -        -            -                13.8

^a HF//HF: Hartree-Fock results obtained at HF/6-31G**
coordinates. From ref. [illegible].
SIESTA//HF: results of this work, obtained at HF/6-31G**
geometries.
Difference: percent difference between HF//HF and
SIESTA//HF results.
SIESTA: results of this work, calculated at the SIESTA
relaxed coordinates.



From ref. 21 
TABLE IV. Bond distances and angles for isolated adenine^a.

Distances (Å)   This work   Ref. 27   Exp.     Angles (deg)   This work   Ref. 27   Exp.
C8-N9           1.391       1.387     1.367    C8-N9-C4       107.30      106.74    105.9
N9-C4           1.390       1.386     1.376    N9-C4-C5       103.67      104.50    105.7
C5-N7           1.394       1.394     1.385    C5-N7-C8       103.35      103.75    103.9
N7-C8           1.334       1.324     1.312    N7-C8-N9       113.40      113.49    113.8
C4-N3           1.355       1.349     1.342    C5-C4-N3       127.53      126.98    126.9
N3-C2           1.357       1.348     1.332    C4-N3-C2       110.66      110.82    110.8
C2-N1           1.362       1.354     1.338    N3-C2-N1       129.26      129.21    129.0
N1-C6           1.360       1.355     1.349    C2-N1-C6       117.92      118.07    118.8
C6-N6           1.368       1.371     1.337    C5-C6-N6       121.14      122.33    123.4
C6-C5           1.431       1.418     1.409    N1-C6-C5       119.51      118.96    117.6
C4-C5           1.420       1.409     1.382    C4-C5-N7       112.28      111.52    110.7
C8-H8           1.098       1.093     -        N7-C8-H8       125.17      125.00    -
C2-H2           1.108       1.098     -        N3-C2-H2       115.52      115.58    -
N9-H9           1.017       1.022     -        C8-N9-H9       127.31      127.64    -
N6-H61          1.020       1.020     -        C6-N6-H61      118.87      116.04    -
N6-H62          1.021       1.020     -        C6-N6-H62      119.52      117.56    -

^a This work: DFT geometries obtained with SIESTA, using the PBE functional; Ref. 27: geometries obtained by
Santamaria and Vazquez using the VWN functional with BP non-local corrections; Exp.: experimental values
from crystallized DNA (refs. [illegible]).



TABLE V. Bond distances and angles for isolated guanine^a.

Distances (Å)   This work   Ref. 15   Exp.     Angles (deg)   This work   Ref.      Exp.
C2-N1           1.382       1.379     1.375    C2-N1-C6       126.77      126.68    124.9
N1-C6           1.449       1.448     1.393    N1-C6-C5       109.40      109.50    111.7
C4-N3           1.371       1.366     1.355    C4-N3-C2       111.66      112.24    111.8
N3-C2           1.335       1.324     1.327    N3-C2-N1       124.17      123.52    124.0
C2-N2           1.378       1.391     1.341    N1-C2-N2       116.83      117.2     116.3
C4-N9           1.386       1.380     1.377    C4-N9-C8       106.72      106.73    106.0
N9-C8           1.398       1.392     1.374    N9-C8-N7       113.20      112.85    113.5
C8-N7           1.329       1.321     1.304    C8-N7-C5       104.19      104.49    104.2
N7-C5           1.391       1.389     1.389    N7-C5-C4       111.31      111.01    110.8
C6-C5           1.460       1.446     1.415    C6-C5-C4       118.39      118.53    119.1
C5-C4           1.423       1.407     1.377    C5-C4-N3       129.61      129.51    128.4
C6-O6           1.237       1.234     1.239    C5-C6-O6       131.48      131.46    128.3
N1-H1           1.025       1.025     -        C2-N1-H1       120.11      120.06    -
N2-H21          1.016       1.023     -        C2-N2-H22      117.27      112.9     -
N2-H22          1.017       1.023     -        C2-N2-H21      122.36      116.37    -
C8-H8           1.098       1.092     -        N7-C8-H8       124.76      125.32    -
N9-H9           1.025       1.022     -        C8-N9-H9       127.56      128.11    -

^a Same as in Table IV.



TABLE VI. Bond distances and angles for isolated cytosine^a.

Distances (Å)   This work   Ref. 27   Exp.     Angles (deg)   This work   Ref. 27   Exp.
N3-C2           1.386       1.379     1.356    N3-C2-N1       116.77      116.47    118.9
C2-N1           1.444       1.439     1.399    C2-N1-C6       123.38      123.38    120.6
N1-C6           1.366       1.363     1.364    N1-C6-C5       119.91      119.71    121.0
C4-N3           1.341       1.332     1.334    C4-N3-C2       119.73      119.76    120.0
C4-N4           1.374       1.378     1.337    N4-C4-N3       116.72      116.57    117.9
C6-C5           1.385       1.371     1.337    C6-C5-C4       115.87      116.18    117.6
C5-C4           1.456       1.445     1.426    C5-C4-N3       124.33      124.43    121.8
C2-O2           1.240       1.236     1.237    N3-C2-O2       125.52      125.68    121.9
N4-H41          1.020       1.019     -        H41-N4-H42     120.31      116.37    -
N4-H42          1.023       1.022     -        C4-N4-H42      118.29      114.91    -
N1-H1           1.027       1.023     -        C2-N1-H1       115.13      115.23    -
C5-H5           1.100       1.094     -        C4-C5-H5       122.99      122.62    -
C6-H6           1.101       1.096     -        N1-C6-H6       117.23      117.29    -

^a Same as in Table IV.



TABLE VII. Bond distances and angles for isolated thymine^a.

Distances (Å)   This work   Ref. 27   Exp.     Angles (deg)   This work   Ref. 27   Exp.
C4-N3           1.420       1.417     1.413    C4-N3-C2       128.46      128.17    126
N3-C2           1.399       1.393     1.345    N3-C2-N1       112.57      112.57    118
C2-N1           1.407       1.398     1.314    C2-N1-C6       123.77      123.70    123
N1-C6           1.386       1.387     1.408    N1-C6-C5       122.72      122.71    120
C6-C5           1.379       1.364     1.369    C6-C5-C4       118.10      118.12    119
C5-C4           1.481       1.470     1.476    C5-C4-N3       114.37      114.69    114
C5-CM           1.510       1.506     1.522    C4-C5-CM       118.03      117.97    119
C2-O2           1.235       1.233     1.246    N1-C2-O2       123.19      123.14    122
C4-O4           1.241       1.238     1.193    O4-C4-N3       120.35      120.07    121
N3-H3           1.028       1.025     -        C2-N3-H3       115.46      115.59    -
N1-H1           1.024       1.022     -        C6-N1-H1       120.97      121.17    -
C6-H6           1.097       1.096     -        C5-C6-H6       122.17      122.16    -
CM-HM1          1.108       1.105     -        C5-CM-HM1      110.61      111.10    -
CM-HM2          1.107       1.104     -        C5-CM-HM2      110.63      110.00    -
CM-HM3          1.104       1.102     -        C5-CM-HM3      111.29      111.29    -

^a Same as in Table IV.
TABLE VIII. H-bond distances (in Å) and angles for
SIESTA optimization of DNA base-pairs. D-A and D-H
are the donor-acceptor and donor-hydrogen distances, respectively.

Pair        Bond           D-A           D-H           Angle
GCWC        N2(H)···O2     2.872         1.036         178.10
            N1(H)···N3     2.913         1.057         175.98
            O6···(H)N4     2.770         1.057         179.98
GG1         N1(H)···O6     2.755         1.057         174.67
            O6···(H)N1     2.756         1.057         174.59
GCNEW       N1(H)···O2     2.763         1.052         173.10
            O6···(H)N1     2.824         1.054         178.79
CC          N4(H)···N3     2.872         1.057         173.41
            N3···(H)N4     2.872         1.057         173.41
GG3         O6···(H)N2     3.169         1.026         167.00
            N7···(H)N1     2.864         1.043         171.91
GA1         N1···(H)N1     3.103         1.042         179.82
            N6(H)···O6     2.844         1.044         179.72
GT1         N1(H)···O4     2.797         1.048         179.40
            O6···(H)N3     2.839         1.058         175.18
GT2         O2···(H)N1     2.843         1.038         178.18
            N3(H)···O6     2.874         1.064         173.51
AC1         N3···(H)N6     3.007         1.039         173.69
            N4(H)···N1     3.046         1.044         176.89
GC1         N3···(H)N2     2.873         1.049         178.50
            N4(H)···N3     3.093         1.045         175.77
AC2         N6(H)···N3     2.957         1.041         169.11
            N7···(H)N4     2.994         1.048         177.18
GA3         N7···(H)N1     3.137         1.044         175.92
            N6(H)···O6     2.806         1.042         164.83
TAH         N3(H)···N7     2.828         1.066         175.94
            O4···(H)N6     2.991         1.035         170.84
TARH        O2···(H)N6     3.041         1.027         169.67
            N3(H)···N7     2.861         1.060         176.67
TAWC        N1···(H)N3     2.859         1.070         179.35
            N6(H)···O4     2.946         1.039         174.35
TARWC       N6(H)···O2     3.006         1.033         171.08
            N1···(H)N3     2.890         1.061         177.89
AA1         N1···(H)N6     3.049         1.041         177.53
            N6(H)···N1     3.049         1.041         177.53
GA4         N3···(H)N6     3.088         1.034         173.88
            N2(H)···N1     2.963         1.044         179.31
TC2         N3(H)···N3     3.065         1.053         168.05
            O4···(H)N4     2.822         1.041         176.27
TC1         N4(H)···O2     2.879         1.036         174.26
            N3···(H)N3     3.149         1.044         165.17
AA2         N7···(H)N6     3.051         1.037         176.28
            N6(H)···N1     3.062         1.040         166.63
TT2         N3(H)···O4     2.872         1.046         172.49
            O4···(H)N3     2.872         1.046         172.48
TT1         N3(H)···O4     2.885         1.049         169.12
            O2···(H)N3     2.876         1.050         170.11
TT3         N3(H)···O2     2.881         1.049         168.35
            O2···(H)N3     2.881         1.049         168.33
GA2         N6(H)···N3     3.146         1.027         166.86
            N7···(H)N2     3.006         1.040         173.24
GG4         N3···(H)N2     3.056         1.037         179.30
            N2(H)···N3     3.059         1.035         179.72
AA3         N7···(H)N6     3.070         1.031         159.56
            N6(H)···N7     3.070         1.031         159.57
2aminoAT    N6(H)···O6     2.921         1.034         176.95
            N3···(H)N1     2.955         1.069         178.95
            N2(H)···O4     [illegible]   [illegible]   [illegible]
GG2         O6···(H)N1     2.960         1.035         175.10
            N7···(H)N2     3.114         1.026         167.68
GC2         O2···(H)N1     2.917         1.040         174.15
            N3···(H)N2     3.139         1.027         177.41

TABLE IX. Interaction (Eint) and stabilization (Et) energies
(kcal/mol) for the relaxed base-pair structures.

Base Pair   Eint     Et
GCWC        -32.2    -27.6
GG1         -30.1    -26.7
GCNEW       -26.3    -19.6
CC          -21.1    -18.5
GG3         -18.2    -17.7
GA1         -17.9    -16.4
GT1         -18.9    -17.8
GT2         -17.8    -16.8
AC1         -16.4    -13.8
GC1         -18.0    -16.0
AC2         -17.9    -15.7
GA3         -16.5    -16.2
TAH         -17.6    -15.6
TARH        -16.4    -14.1
TAWC        -16.3    -14.2
TARWC       -15.1    -14.2
AA1         -14.2    -13.7
GA4         -14.2    -13.6
TC2         -13.8    -11.5
TC1         -12.2    -10.6
AA2         -13.7    -13.0
TT2         -13.1    -10.9
TT1         -12.9    -10.7
TT3         -12.6    -11.2
GA2         -12.9    -11.7
GG4         -9.6     -8.0
AA3         -11.6    -10.7
2aminoAT    -19.6    -16.8
GG2         -13.1    -12.4
GC2         -12.3    -10.2

TABLE X. Guanine-cytosine Watson-Crick base-pair^a.

                  HF/6-31G**   DFT (B3LYP)   DFT (VWN-BP)   SIESTA
d(N2(H)···O2)     3.017        2.930         2.930          2.872
d(N2,H)           1.001        -             1.035          1.035
d(N1(H)···N3)     3.037        2.920         2.923          2.913
d(N1,H)           1.008        -             1.051          1.056
d(O6···(H)N4)     2.921        2.780         2.785          2.770
d(H,N4)           1.007        -             1.055          1.057
Eint              -25.5        -29.6         -27.7          -32.2
Et                -            -26.3         -22.9          -27.6

^a HF/6-31G**: results obtained at the HF level, with a 6-31G** basis. From ref. [illegible].
DFT (B3LYP): DFT results obtained with the Becke3LYP functional. From ref. [illegible].
DFT (VWN-BP): DFT results obtained with the VWN functional with BP non-local corrections. From ref. [illegible].
SIESTA: Present results.



TABLE XI. Thymine-adenine Watson-Crick base-pair^a.

                  HF/6-31G**   DFT (VWN-BP)   SIESTA
d(N6(H)···O4)     3.086        2.955          2.946
d(N6,H)           0.999        1.037          1.039
d(N1(H)···N3)     2.988        2.66           2.859
d(N3,H)           1.013        1.067          1.070
Eint              -12.4        -13.9          -16.3
Et                -            -11.9          -14.2

^a HF/6-31G**: results obtained at the HF level, with a 6-31G** basis. From ref. [illegible].
DFT (VWN-BP): DFT results obtained with the VWN functional with BP non-local corrections. From ref. [illegible].
SIESTA: Present results.



TABLE XII. Cytosine-cytosine base-pair^a.

                  HF/6-31G**   MP2//HF   MP2      DFT (B3LYP)   SIESTA
d(N4(H)···N3)     3.050        3.050     2.980    2.900         2.872
Eint              -17.3        -18.8     -20.5    -20.4         -21.1
Et                -            -17.5     -18.7    -18.1         -18.5

^a HF/6-31G**: results obtained at the HF level, with a 6-31G** basis. From ref. 21.
MP2//HF: results obtained at the MP2 level, with the HF geometries. From ref. 21.
MP2: results obtained at the MP2 level, with MP2 geometries. From ref. [illegible].
DFT (B3LYP): DFT results obtained with the Becke3LYP functional. From ref. [illegible].
SIESTA: Present results.